35 research outputs found

    Using the Data Quality Dashboard to Improve the EHDEN Network

    Federated networks of observational health databases have the potential to be a rich resource to inform clinical practice and regulatory decision making. However, the lack of standard data quality processes makes it difficult to know if these data are research ready. The EHDEN COVID-19 Rapid Collaboration Call presented the opportunity to assess how the newly developed open-source tool Data Quality Dashboard (DQD) informs the quality of data in a federated network. Fifteen Data Partners (DPs) from 10 different countries worked with the EHDEN taskforce to map their data to the OMOP CDM. Throughout the process at least two DQD results were collected and compared for each DP. All DPs showed an improvement in their data quality between the first and last run of the DQD. The DQD excelled at helping DPs identify and fix conformance issues but showed less of an impact on completeness and plausibility checks. This is the first study to apply the DQD on multiple, disparate databases across a network. While study-specific checks should still be run, we recommend that all data holders converting their data to the OMOP CDM use the DQD as it ensures conformance to the model specifications and that a database meets a baseline level of completeness and plausibility for use in research.

    A standardized analytics pipeline for reliable and rapid development and validation of prediction models using observational health data

    Background and objective: As a response to the ongoing COVID-19 pandemic, several prediction models in the existing literature were rapidly developed, with the aim of providing evidence-based guidance. However, none of these COVID-19 prediction models have been found to be reliable. Models are commonly assessed to have a risk of bias, often due to insufficient reporting, use of non-representative data, and lack of large-scale external validation. In this paper, we present the Observational Health Data Sciences and Informatics (OHDSI) analytics pipeline for patient-level prediction modeling as a standardized approach for rapid yet reliable development and validation of prediction models. We demonstrate how our analytics pipeline and open-source software tools can be used to answer important prediction questions while limiting potential causes of bias (e.g., by validating phenotypes, specifying the target population, performing large-scale external validation, and publicly providing all analytical source code). Methods: We show step-by-step how to implement the analytics pipeline for the question: ‘In patients hospitalized with COVID-19, what is the risk of death 0 to 30 days after hospitalization?’. We develop models using six different machine learning methods in a USA claims database containing over 20,000 COVID-19 hospitalizations and externally validate the models using data containing over 45,000 COVID-19 hospitalizations from South Korea, Spain, and the USA. Results: Our open-source software tools enabled us to efficiently go end-to-end from problem design to reliable model development and evaluation. When predicting death in patients hospitalized with COVID-19, AdaBoost, random forest, gradient boosting machine, and decision tree yielded similar or lower internal and external validation discrimination performance compared to L1-regularized logistic regression, whereas the MLP neural network consistently resulted in lower discrimination. L1-regularized logistic regression models were well calibrated. Conclusion: Our results show that following the OHDSI analytics pipeline for patient-level prediction modeling can enable the rapid development towards reliable prediction models. The OHDSI software tools and pipeline are open source and available to researchers from all around the world.

    International cohort study indicates no association between alpha-1 blockers and susceptibility to COVID-19 in benign prostatic hyperplasia patients

    Purpose: Alpha-1 blockers, often used to treat benign prostatic hyperplasia (BPH), have been hypothesized to prevent COVID-19 complications by minimising cytokine storm release. However, the proposed treatment based on this hypothesis currently lacks support from reliable real-world evidence. We leverage an international network of large-scale healthcare databases to generate comprehensive evidence in a transparent and reproducible manner. Methods: In this international cohort study, we deployed electronic health records from Spain (SIDIAP) and the United States (Department of Veterans Affairs, Columbia University Irving Medical Center, IQVIA OpenClaims, Optum DOD, Optum EHR). We assessed the association between alpha-1 blocker use and risks of three COVID-19 outcomes (diagnosis, hospitalization, and hospitalization requiring intensive services) using a prevalent-user active-comparator design. We estimated hazard ratios using state-of-the-art techniques to minimize potential confounding, including large-scale propensity score matching/stratification and negative control calibration. We pooled database-specific estimates through random effects meta-analysis. Results: Our study overall included 2.6 and 0.46 million users of alpha-1 blockers and of alternative BPH medications, respectively. We observed no significant difference in their risks for any of the COVID-19 outcomes, with our meta-analytic HR estimates being 1.02 (95% CI: 0.92-1.13) for diagnosis, 1.00 (95% CI: 0.89-1.13) for hospitalization, and 1.15 (95% CI: 0.71-1.88) for hospitalization requiring intensive services. Conclusion: We found no evidence of the hypothesized reduction in risks of the COVID-19 outcomes from the prevalent use of alpha-1 blockers; further research is needed to identify effective therapies for this novel disease.

    Phenotype Algorithms for the Identification and Characterization of Vaccine-Induced Thrombotic Thrombocytopenia in Real World Data: A Multinational Network Cohort Study

    INTRODUCTION: Vaccine-induced thrombotic thrombocytopenia (VITT) has been identified as a rare but serious adverse event associated with coronavirus disease 2019 (COVID-19) vaccines. OBJECTIVES: In this study, we explored the pre-pandemic co-occurrence of thrombosis with thrombocytopenia (TWT) using 17 observational health data sources across the world. We applied multiple TWT definitions, estimated the background rate of TWT, characterized TWT patients, and explored the makeup of thrombosis types among TWT patients. METHODS: We conducted an international network retrospective cohort study using electronic health records and insurance claims data, estimating background rates of TWT amongst persons observed from 2017 to 2019. Following the principles of existing VITT clinical definitions, TWT was defined as patients with a diagnosis of embolic or thrombotic arterial or venous events and a diagnosis or measurement of thrombocytopenia within 7 days. Six TWT phenotypes were considered, which varied in the approach taken in defining thrombosis and thrombocytopenia in real world data. RESULTS: Overall TWT incidence rates ranged from 1.62 to 150.65 per 100,000 person-years. Substantial heterogeneity exists across data sources and by age, sex, and alternative TWT phenotypes. TWT patients were likely to be men of older age with various comorbidities. Among the thrombosis types, arterial thrombotic events were the most common. CONCLUSION: Our findings suggest that identifying VITT in observational data presents a substantial challenge, as implementing VITT case definitions based on the co-occurrence of TWT results in large and heterogeneous incidence rates and in a cohort of patients with baseline characteristics that are inconsistent with the VITT cases reported to date. SUPPLEMENTARY INFORMATION: The online version contains supplementary material available at 10.1007/s40264-022-01187-y.

    Multinational patterns of second line antihyperglycaemic drug initiation across cardiovascular risk groups: federated pharmacoepidemiological evaluation in LEGEND-T2DM

    Objective: To assess the uptake of second line antihyperglycaemic drugs among patients with type 2 diabetes mellitus who are receiving metformin. Design: Federated pharmacoepidemiological evaluation in LEGEND-T2DM. Setting: 10 US and seven non-US electronic health record and administrative claims databases in the Observational Health Data Sciences and Informatics network in eight countries from 2011 to the end of 2021. Participants: 4.8 million patients (≥18 years) across US and non-US based databases with type 2 diabetes mellitus who had received metformin monotherapy and had initiated second line treatments. Exposure: The exposure used to evaluate each database was calendar year trends, with the years in the study that were specific to each cohort. Main outcome measures: The outcome was the incidence of second line antihyperglycaemic drug use (ie, glucagon-like peptide-1 receptor agonists, sodium-glucose cotransporter-2 inhibitors, dipeptidyl peptidase-4 inhibitors, and sulfonylureas) among individuals who were already receiving treatment with metformin. The relative drug class level uptake across cardiovascular risk groups was also evaluated. Results: 4.6 million patients were identified in US databases, 61 382 from Spain, 32 442 from Germany, 25 173 from the UK, 13 270 from France, 5580 from Scotland, 4614 from Hong Kong, and 2322 from Australia. During 2011-21, the combined proportional initiation of the cardioprotective antihyperglycaemic drugs (glucagon-like peptide-1 receptor agonists and sodium-glucose cotransporter-2 inhibitors) increased across all data sources, with the combined initiation of these drugs as second line drugs in 2021 ranging from 35.2% to 68.2% in the US databases, 15.4% in France, 34.7% in Spain, 50.1% in Germany, and 54.8% in Scotland. From 2016 to 2021, in some US and non-US databases, uptake of glucagon-like peptide-1 receptor agonists and sodium-glucose cotransporter-2 inhibitors increased more significantly among populations with no cardiovascular disease compared with patients with established cardiovascular disease. No data source provided evidence of a greater increase in the uptake of these two drug classes in populations with cardiovascular disease compared with no cardiovascular disease. Conclusions: Despite the increase in overall uptake of cardioprotective antihyperglycaemic drugs as second line treatments for type 2 diabetes mellitus, their uptake was lower in patients with cardiovascular disease than in people with no cardiovascular disease over the past decade. A strategy is needed to ensure that medication use is concordant with guideline recommendations to improve outcomes of patients with type 2 diabetes mellitus.

    Characteristics and outcomes of over 300,000 patients with COVID-19 and history of cancer in the United States and Spain

    Background: We described the demographics, cancer subtypes, comorbidities, and outcomes of patients with a history of cancer and coronavirus disease 2019 (COVID-19). Second, we compared patients hospitalized with COVID-19 to patients diagnosed with COVID-19 and patients hospitalized with influenza. Methods: We conducted a cohort study using eight routinely collected health care databases from Spain and the United States, standardized to the Observational Medical Outcomes Partnership common data model. Three cohorts of patients with a history of cancer were included: (i) diagnosed with COVID-19, (ii) hospitalized with COVID-19, and (iii) hospitalized with influenza in 2017 to 2018. Patients were followed from index date to 30 days or death. We reported demographics, cancer subtypes, comorbidities, and 30-day outcomes. Results: We included 366,050 and 119,597 patients diagnosed and hospitalized with COVID-19, respectively. Prostate and breast cancers were the most frequent cancers (range: 5%–18% and 1%–14% in the diagnosed cohort, respectively). Hematologic malignancies were also frequent, with non-Hodgkin’s lymphoma being among the five most common cancer subtypes in the diagnosed cohort. Overall, patients were aged above 65 years and had multiple comorbidities. Occurrence of death ranged from 2% to 14% and from 6% to 26% in the diagnosed and hospitalized COVID-19 cohorts, respectively. Patients hospitalized with influenza (n = 67,743) had a similar distribution of cancer subtypes, sex, age, and comorbidities but lower occurrence of adverse events. Conclusions: Patients with a history of cancer and COVID-19 had multiple comorbidities and a high occurrence of COVID-19-related events. Hematologic malignancies were frequent. Impact: This study provides epidemiologic characteristics that can inform clinical care and etiologic studies.

    The National COVID Cohort Collaborative (N3C): Rationale, design, infrastructure, and deployment.

    OBJECTIVE: Coronavirus disease 2019 (COVID-19) poses societal challenges that require expeditious data and knowledge sharing. Though organizational clinical data are abundant, these are largely inaccessible to outside researchers. Statistical, machine learning, and causal analyses are most successful with large-scale data beyond what is available in any given organization. Here, we introduce the National COVID Cohort Collaborative (N3C), an open science community focused on analyzing patient-level data from many centers. MATERIALS AND METHODS: The Clinical and Translational Science Award Program and scientific community created N3C to overcome technical, regulatory, policy, and governance barriers to sharing and harmonizing individual-level clinical data. We developed solutions to extract, aggregate, and harmonize data across organizations and data models, and created a secure data enclave to enable efficient, transparent, and reproducible collaborative analytics. RESULTS: Organized in inclusive workstreams, we created legal agreements and governance for organizations and researchers; data extraction scripts to identify and ingest positive, negative, and possible COVID-19 cases; a data quality assurance and harmonization pipeline to create a single harmonized dataset; population of the secure data enclave with data, machine learning, and statistical analytics tools; dissemination mechanisms; and a synthetic data pilot to democratize data access. CONCLUSIONS: The N3C has demonstrated that a multisite collaborative learning health network can overcome barriers to rapidly build a scalable infrastructure incorporating multiorganizational clinical data for COVID-19 analytics. We expect this effort to save lives by enabling rapid collaboration among clinicians, researchers, and data scientists to identify treatments and specialized care and thereby reduce the immediate and long-term impacts of COVID-19.
